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trum is important for verifying models of structure formation, especiafly in 
view of forthcoming high-resolution observations. We argue that higher-order 
statistics of inter-scale correlations, because of their low cosmic variance, may 
be effective in detecting non-Gaussian features in the CMB. Inter-scale corre- 
lations are generically produced in defect-based models of structure formation. 



CN ' We analytically study properties of general higher-order cumulants of Fourier 

■ components of homogeneous random fields and design a new set of statistics 

o 



suitable for small-scale data analysis. Using simulated non-Gaussian fields, 
we investigate the performance of the proposed statistics in presence of a 
Gaussian background and pixel noise, using the bispectrum as the underly- 
ing cumulant. Our numerical results suggest that detection of non-Gaussian 



^ . features by our method is reliable if the power spectrum of the non-Gaussian 

components dominates that of the Gaussian background and noise within at 

least a certain range of accessible scales. 
98.70.V;98.80.C 



I. INTRODUCTION 

The fluctuations in the cosmic microwave background (CMB) radiation flrst reliably 
measured by the COBE are one of today's most cosmologically important measurements 
(see for a recent review). The future satellite missions MAP |Q] and PLANCK are 
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expected to provide high-resolution, low-noise CMB data that will strongly constrain cur- 
rent theories of structure formation. One of the key differences between the competing 
theories is an approximately Gaussian distribution of density fluctuations produced in most 
inflationary theories versus the generically non-Gaussian inhomogeneities in scenarios based 
on topological defects. A precise statistical analysis of the CMB fluctuations is needed to 
distinguish non-Gaussian signatures of cosmic defects P] and small deviations from Gaus- 
sianity in inflationary universe due to generic effects such as a non-linear evolution [^0, 
gravitational lensing [§, and higher-order couplings during recombination f^. The power 
spectrum, which is the statistic easiest both to predict theoretically and to extract from 
data, is by definition insensitive to non-Gaussian correlations. For this reason, and espe- 
cially in view of forthcoming observations, detection of non-Gaussian features in the CMB 
is an important task. 

Since all theories give only statistical predictions of the CMB, one's conclusions from 
a single observation of the microwave sky are by necessity probabilistic. Many criteria of 
Gaussianity have been proposed in the literature and applied to the available data, mostly 
yielding results consistent with the Gaussian hypothesis []rU[. Although some claims of find- 
ing a non-Gaussian signal in COBE have been advanced recently ]TT|-p!^, not all of them 
are equally persuasive and some have been ascribed to data contamination and noise in 
Refs. |lT5| , p!6[| . Apart from these problems, the usefulness of any CMB statistic is fundamen- 
tally limited by cosmic variance. Therefore we would like to look for statistical descriptors 
of non-Gaussian signal that have a naturally low variance. 

Cosmic variance manifests itself differently in real space and in the Fourier space where 
the CMB temperature fluctuations are usually decomposed into modes aim using the spher- 
ical harmonic expansion 




l,m 



(1) 



The power spectrum estimator 
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is increasingly more precise at smaller angular scales (higher I). In other words, the cosmic 
variance in the Fourier space is shifted away from small scales toward large scales. One 
would therefore expect any statistics of Fourier modes at very small scales to be relatively 
free of cosmic variance. On the other hand, real space data is correlated and estimators 
based on correlations of AT {6, 0) are equally affected by cosmic variance throughout all 



angular scales. (The recently introduced wavelet-based statistics p!7|JT3[1 can be regarded 



as occupying the middle ground between the real and the Fourier space descriptors. See 



also |T8| for an example of a low-variance real-space statistic which is however strongly 
power-spectrum dependent.) 

This motivates us to consider small-scale Fourier modes in hopes of obtaining a sensitive 
statistic. Previous research [l^ suggests that individual Fourier modes are hkely to be 



nearly Gaussian distributed because of the central limit theorem. Therefore we intend to 
investigate inter-scale correlations in Fourier space for the role of non-Gaussian indicators. 
A non-Gaussian signal may manifest itself as a correlation between Fourier modes aim of 
different scales /; this extra correlation must be of third order or higher because homogeneity 
requires {aimalim') ^ ^ii'^mm'- The third-order correlator of the Fourier modes, called the 
"bispectrum" , has been extensively studied in the literature [pOH 2^J^ and recently applied 



to the COBE data 1 11,13]. It was shown that the bispectrum carries signatures from cosmic 



defects, as well as from non-linearities of evolution in inflationary models. Defects generically 
produce non-Gaussian features also in higher-order correlations, as we shall show. 

We would like to reformulate the problem of finding the cross-correlation between some 
chosen scales I and I' as a Gaussianity test on a suitable distribution. Consider for simplicity 
a random field in flat (two-dimensional) space, decomposed into Fourier modes a^. We may 
regard the set of mode values Ok, Ok' at the scales |k|, |k'| as a sample of a two- variable 
distribution {a, a'} and proceed to test that distribution for Gaussianity. A general way of 
testing a distribution for Gaussianity is by using cumulants (see e.g. p4|). In our case we 



need to employ suitable multi-variable cumulants, such as a third-order cumulant 

X(ki,k[,k2) ^ (aki ) , (3) 

with |ki| = |k'i| 7^ |k2|. Such cumulants describe non-Gaussian correlations between per- 
turbations at different scales. Since the variance of a cumulant estimator is decreased when 
the number of points in the sample grows, we expect the statistic to be increasingly more 
sensitive on smaller scales (larger I). We shall demonstrate for a Gaussian field that Fourier 
space cumulant estimators of any order n, which we denote xi^i, ■■■,kn), at small scales 
(large ki) are statistically approximately independent and themselves approximately Gaus- 
sian distributed. We also find the variances of these cumulant estimators of any order. This 
significantly simplifies the likelihood analysis, since the variances are theoretically known 
and the estimators can be normalized to have unit variances. 

In other words, if the underlying CMB map were Gaussian, all quantities xiki, ■■■,kn) 
for all scales ki form (after normalization) a sample of independent realizations of a normal 
distribution. One could test this hypothesis by choosing an appropriate range of accessible 
scales ki, combining all estimators x{ki, kn) at these scales into one sample and computing 
the first few cumulants ujj (we take j — 1, 5) of that sample. If the Gaussian hypothesis 
holds for the original distribution, the quantities ujj are in turn Gaussian distributed with 
known means and variances. We now propose this as a Gaussianity test for the original 
map. Likelihood analysis of the descriptors ujj is at least as strong as a simultaneous fit of 
all inter-scale cumulants and is potentially more discriminating. 

In this method, one is free to choose a subset of cumulants x {ki, kn) and a subset of 
scales ki at which to evaluate them. Since the statistical variance of the cumulant estimators 
is decreased at smaller scales (larger ki) but grows as n! with the order n of the cumulants, 
the most promising results should come from the lowest cumulants, such as the bispectrum 
{n — 3), and at smallest available scales. In this initial investigation we use only the 
bispectrum to build the estimators uJn- 

To evaluate the efiiciency of the method, we use several simulated non-Gaussian fields 
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inspired by defect-based structure formation models. One of the simulated models consists a 
superposition of fixed temperature profiles centered on a random set of Poisson distributed 
points. The advantage of this model is that for any shape of the profile and for any dis- 
tribution of the profile intensity and size, one can analytically obtain the full generating 
functional of the resulting random field, which allows (in principle) to evaluate any statistic. 
In particular, it is straightforward to estimate the expected inter-scale correlations of Fourier 
modes of this field. Another model we used is a simulated CMB map from cosmic strings 
25| , superimposed on Gaussian background and pixel noise. Our goal in all simulations 



was to find the levels of Gaussian backgrounds at which the non-Gaussian signal is still 
detectable with our statistic, given a certain level of noise. 

The outline of the paper is as follows. In Sec. |I| we define the particular cumulants that 
describe inter-scale correlations of the Fourier modes of a random field and show how we 
estimate them from a single field realization. We derive the means and variances of these 
cumulant estimators assuming Gaussian random field, as would be needed for likelihood 
analysis, and show that the error bars decrease for large /, as expected. Details of the 
calculations are given in Appendices ^ and ^. In Sec. |T| we describe a model non-Gaussian 
field made up of a random superposition of shapes, analytically determine the expected 
non-vanishing cumulants, and investigate the sensitivity of our criterion for that model in 
presence of Gaussian backgrounds. The necessary formalism is developed in Appendix 0. 



In Sec. 1^ we present numerical results obtained with several simulated non-Gaussian maps 
using the lowest-order statistic based on the bispectrum. The method is tested on maps of 
point sources, random rectangles, and cosmic strings, mixed with Gaussian background and 
noise. We give brief conclusions about the viability of the proposed method. 

II. CUMULANTS IN FOURIER SPACE 

For simplicity, in this section we consider random fields on a plane; the results can be 
straightforwardly generalized to random fields on a sphere, such as the CMB temperature 
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perturbation AT{(j),6) /T. From the Fourier space viewpoint, a random field f (x) is a 
collection of random variables (modes) Ok, 



«k = ^ y e-^^V (x) dx. (4) 

Homogeneity of the random field (translation invariance in real space) constrains the joint 
distribution of {ok}, namely the moments must satisfy 

(akiOka-.-ak,,) = if ki + ... + k„ 7^ 0, (5) 

and isotropy means that moments are invariant under rotations R in the Fourier space, 

(«ki'3'k2---'^k„) = {0'RkiO'Rk2---0'Rkn) ■ (6) 

(In the spherical case, there is only one condition of invariance under rotations of the sphere.) 
The task of testing Gaussianity of the random field is now translated into checking whether 
all ak are jointly Gaussian distributed. 

In general, one could describe the joint distribution of all Fourier modes by a suitable 
generating functional of moments, 

^ [j (k)] = f^[ dki...dk j ^^"^ (akiak2---ak„) 



n=0 



(exp 



-i j j (k) a^dk ^ , (7) 



from which one recovers all the moments by functional differentiation, 

(ak,ak,...akj = J-J^^yJIJ^f ^^^^^^^ ' 
Here the functional argument j (k) is a complex- valued function of k satisfying j (— k) = 
j* (k). It is convenient to define the general cumulants of the Fourier modes as quantities 
generated by the logarithm of the functional of Eq. (^, 

^'"'"'--'-")-' %,(k.)....,(k,.) "'^'-'"'"'-°- 
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(The notation C^^^ is chosen to be consistent with Eq. (|C9|) of Appendix C while C'-"^ is 
reserved for real-space cumulants.) For example, the distribution of modes of a Gaussian 
random field with power spectrum P (k) is characterized by the generating functional 



Zg [j (k)] = exp 



(10) 



and, as expected, all cumulants C''-"-' of order n > 3 vanish for this distribution. It is easy to 
see that the general cumulants C**-"^ satisfy homogeneity and isotropy conditions similar to 
Eqs. (|D-(^. This is because Eqs. (^-(H) imply that the generating functional Z [j (k)] is in- 
variant under substitutions j (k) — > j {Rk) and j (k) —>■ e^^^j (k) of its functional argument, 
and therefore lnZ[j (k)], the generating functional for cumulants, must also be rotation- 
and translation-invariant. This, for instance, constrains the cumulants C**^") (ki, k„) to 
identically vanish unless ki + ... + k„ = 0. 

Now we turn to statistics of Ok that are relevant for the analysis of data coming from 
one observation. Since a measurement of one realization of the random field gives only one 
set of values {o-k}, we are limited in the kinds of information about the joint distribution of 
{flk} that we can extracted without a priori knowledge. For instance, we cannot efficiently 
test Eqs. (§])-(§) for any particular choice of {kj} because in each case we would have only 
one value: akiak2---Ok„- An inhomogeneous random field that has a "wrong" distribution 
of just one mode ak cannot be distinguished from a homogeneous and isotropic field on the 
basis of one sample. (Of course, one would not expect such an artificial random field to have 
physical relevance.) 

Similarly, we cannot test the hypothesis that any two particular modes Ok^ and Okj 
come from a jointly Gaussian distribution if only one realization of these modes is available. 
Clearly one can efficiently test a distribution for Gaussianity only if one has a large enough 
number of samples of that distribution. Therefore, to obtain any result at all concerning 
the Gaussianity of Ok, we must assume that several of the modes Ok come from the same 
distribution. From the natural assumption of isotropy it follows that the modes Ok within 
a ring of fixed scale |k| = are identically distributed, and thus the ring |k| = A; provides 



a sample that allows us to test their distribution for Gaussianity. This is the assumption 
behind the bispectrum test. Strictly speaking, a negative result of such a test would indicate 
either a non-Gaussian or an anisotropic distribution. However, most theories predict isotropy 
of CMB, and, assuming that all foregrounds are adequately dealt with, we are much less 
interested in detecting anisotropy or inhomogeneity than we are in finding non-Gaussian 
signals. 

We are therefore motivated to assume homogeneity and isotropy and to regard the modes 
Ok at a fixed scale |k| as independent samples of the same joint distribution of modes. This 
assumption also allows us to investigate correlations between different scales, which is the 
main interest of the present article. Consider two rings of modes 2 two fixed scales ki 
and /c2- The values Okj 2 along the two rings may be regarded as independent realizations of 
the joint two-variable distribution. Now we would like to test whether that distribution is 
Gaussian in two variables. 

A standard way to test a sample for Gaussianity is to compute cumulants of various orders 
(see e.g. I^^): for a Gaussian distribution, the cumulants of order > 3 vanish. Cumulants 
of a distribution of two variables (x, y) are quantities labeled by two indices, e.g. Xmn has 
"dimension" x'^y^; the general definition of Xmn can be given by Eqs. (|B3|)-( |B^ of Appendix 



B. We are interested in cumulants that describe cross-correlation between the two variables, 
for instance a nontrivial third-order cross-cumulant is 

X12 = {xy') - (x) (y') - 2 (xy) (y) + 2 (x) {yf . (11) 

Similarly, one can define cumulants of a three- variable distribution {x,y,z), for example 

Xiii = {xyz) - {xy) {z) - {xz) {y) - {yz) {x) + 2 (x) {y) {z) . (12) 

Given a sample of points {xi,yi) of a two-variable distribution, one could estimate 
the cumulant X12 directly by evaluating the sample averages required by Eq. (0). As 
shown in Appendix A, such estimators are generally not unbiased, and for a Gaussian 
field they yield a non-zero expectation value of order A^~^, as given by Eqs. (|Al|) - (|A2|) 
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for one-variable and by Eq. (piO|) for two-variable cumulants. This of course presents no 
problem for likelihood analysis since this expectation value is known for a Gaussian field 
and can be simply subtracted. We also derive the expected variances of cumulant estimators 
first for one-variable cumulants in Appendix A, and then for multivariable and Fourier 
space cumulants in Appendix B. We show in Eqs. ( |A3| ), ( |B11| )-( |BT^ ) that the variance of a 



multivariable cumulant estimator for a sample of simultaneous realizations of a Gaussian 
distribution of n variables {xj} is always of order A^~^ and is given by 



var [xi,..,J = ^..a-" + O (AT-j , (13) 

where ai are dispersions of the variables Xi. This general result confirms and generalizes 
the formula obtained numerically in Ref. for one-variable (n = 1) cumulants. We see 



that since the number A^ of points in the sample is equal to the number of modes at the 
chosen scale |k|, the variance is indeed decreased for larger |k|. 

We could directly apply the cumulant technique to the Fourier modes by taking e.g. x = 
a]i,y = ttk' in Eq. (pA]). The homogeneity constraint Eq. (|^) suggests that of all possible cross- 
cumulants C*'-"-' (ki, k„), only those for which I]i kj = need to be considered as possible 
non-Gaussian signals, all others being "noise" resulting from accidental inhomogeneity of 
the given sample. For instance, the third-order cumulant relating two chosen scales ki 
and k2 should be estimated from the set of triples |aku c^k'^? ctk2} where ki + k'^ + k2 = 
and |ki| = |k'^| = ki, |k2| = k2. (Clearly, the cross-cumulant of the scales ki and ^2 can be 
nonzero only if k2 < 2ki.) This procedure is equivalent to regarding the triples jctki, ^k'^, ak2} 
as realizations of a t/iree-variable distribution for which we are evaluating the cross-cumulant 
Xiii, while in the notation of Eq. (P), Xiii = 

^(3) (k i,k'^,k2). Below we shall denote this 
inter-scale cumulant by xni (^i;^2)- In Appendix B we derive general expressions for the 
variances of such cross-cumulants. The sample size A^ is determined by the number of modes 
in the smallest of the rings ki and k2, and the variance of xiii {ki, ^2) is 

var [xiii {k^, k2)] = N'' [P {k,)]' P {k2) {2k, < k2), 

var [xiii (A;i, ^^2)] = 2N-' [P {k,)f P (^^2) {2k, = k2). (14) 
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We show in Appendix B that Fourier mode cumulant estimators for a Gaussian field are 
always uncorrelated Gaussian random variables, up to terms of order A^~^. The Gaussianity 
hypothesis will hold for a distribution if its cumulants estimated from the sample of data 
points fall within their sample variances. The likelihood analysis in our case consists of 
computing the chosen cumulant estimators (such as Xiii) ^^i various pairs {ki, of scales 
and normalizing them to their theoretical Gaussian variances; the power spectrum needs 
to be estimated beforehand from the same map. The resulting set of normalized cumulant 
estimators contains as many numbers as there are pairs of different scales {ki, /C2) in the map 
and should be a set of normally distributed, independent random values, if the underlying 
map is Gaussian. We propose to test this by computing the first few cumulants of that set, 
which is at least equivalent to joint estimation of Gaussianity of the inter-scale cumulants 
for all pairs of scales. 



III. A NON-GAUSSIAN MODEL 

To estimate the sensitivity of the Fourier cross-correlations to non-Gaussian signal, we 
use the model of randomly superimposed shapes similar to that of Ref. [^. The random 
field / (x) is constructed as a superposition of "seeds" , i.e. fixed profiles s (x) located at 
random points Xj in the 2-dimensional space. In addition, the profiles are attenuated, scaled 
and rotated randomly, 

/ (x; {xfc, Uk, Afc, Rk}) = UkS [K^Rk {x - Xfc)) . (15) 

k 

Here the number of seeds n and seed positions Xj are randomly chosen in some predefined 
way; Uk are attenuation factors, are scale factors, and Rk are rotations, all drawn randomly 
for each seed out of their corresponding distributions (z/) du and p\ (A) dX (rotations are 
uniformly distributed in the rotation group). Such random fields are physically relevant 
since the shapes could come from point sources or individual topological defects, randomly 
positioned in the sky. 
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We would like to find out whether our criterion can distinguish a certain level of this 
non-Gaussian signal in presence of Gaussian background or noise. If we construct the full 
generating functional of the non-Gaussian random field analytically, we can superimpose a 
Gaussian background on it and obtain analytic predictions for the sensitivity of our criterion. 

Nice properties of the Poisson distribution make it possible to obtain the full generating 
functional for the random field with Poisson distributed seeds. It is given by Eq. (|C5|) of 
Appendix C, 

\nZ[J{x)] = -n, + n,J dxorfi?pA (A) c/Ap, (z/) rfz/e-'"/<^"'^("-"°))''(")''" . (16) 

Here ric is the mean density of seeds. The non-Gaussian components of the distribution 
are read directly from the generating functional which shows that, in general, non-Gaussian 
cumulants of all orders are present, cf. Eq. (|G9| ) . As expected from the central limit theorem, 
the relative magnitude of non-Gaussian components decays when the number density of seeds 
grows. 

A more general result relating the generating functionals of the seed distribution and of 
the resulting random field is expressed by Eqs. ( |G20D and ( |C24D . It shows that the generat- 



ing functional of Eq. ( |CB| ) could be expressed analytically because the Poisson distribution 
of seeds is described by an analytic generating functional. In more complicated cases or in 
situations where only a few first seed correlations ^ ( known, the full generat- 

ing functional will not be available but we can still use these general equations to obtain, 
accordingly, the first few cumulants of the resulting random field / (x). 

These results are consistent with the formalism of Ref. where instead of scaling, 
rotation, or attenuation of seed profiles, a distribution of seed masses rrii was considered, 
the shape profile s{x — Xi, rrii) being a function of the seed mass. Our formalism may be 
generalized to treat the shape profile s {x; Xi) as an arbitrary function of Xj where the 
"coordinate" parameter includes the position of the seed as well as its mass, scaling etc., 
and the seed distribution must be specified with respect to this generalized parameter. An 
analogue of Eq. ( P24|) will hold also in this case; the generating functional of the random 
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field will be simply related to the generating functional of seeds. 

Similar results hold also for the generating functional Z^^c [j (k)] for the Fourier compo- 
nents of the non-Gaussian random field / (x) [cf. Eq. (|C^)]. We can add to / {x) a Gaussian 
background with a known power spectrum Pcik); the background can be described by a 
generating functional of Eq. (|TUp. The new generating functional will be the product of the 
two and the cumulants will be the sums of the two sets of cumulants. Since the Gaussian 
field has vanishing higher-order cumulants, the only change will be in the power spectrum, 
which becomes 

P{k)=Pr,G{k)+PG{k). (17) 

Consider one cumulant estimator C**^") (^i, /c„) for a certain selected set of scales fcj. Since 
its variance is inversely proportional to the appropriate powers of P (k) as given by Eq. ( |B13[ ), 
while its expectation value is unchanged after adding the Gaussian background, the sensi- 
tivity of the cumulant estimator will be diminished by the factor 

This suggests that the sensitivity of cumulant estimators is unaffected on scales where the 
Gaussian background is negligible compared to the non-Gaussian signal, and will be pro- 
portionately decreased otherwise. 

IV. NUMERICAL RESULTS 

In this section we present the numerical procedures and results of testing the sensitivity 
of our method. 

A. The scheme 

The procedure for testing a CMB map for Gaussianity using the Fourier space cumulant 
criterion described in the previous sections can be summarized as follows: 
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1. Fourier transform the map to obtain the modes {ok} (this step would be unnecessary 
for data obtained from interferometers). Estimate the power spectrum P {k). 

2. Divide the Fourier domain into rings of fixed moduh k = |k|. For each radius ki going 
from fcmin = O.lfcjnax to fcmax, whcre fcmax IS the highest accessible wavenumber, and for 
each radius k2 going from fcmin to 2ki, we take a pair of modes and a^'^ from the 
ring of radius ki, that is with |ki| = \k.[\ = ki, and a mode from the second ring 
of radius ^2. The angle 6 between the vectors ki and h[ is determined from 

e 



/c2 = 2ki cos ■ 



(19) 



The values of the two modes on the first ring (ak^ and Ok'^ ) and the single mode on the 
second ring (okj) are joined to form a set of three variables {ak^, a^, akj}, with the 
vectors satisfying ki + k'^ + k2 = 0, their configuration specified by a pair of parameters 
(fci, /C2) up to a rotation. Fig. |l| shows one example of such a configuration. 




FIG. 1. A configuration of modes {okn ^k'j ; ^k2} with ki + k']^ + k2 = 0, where |ki| = \\s.i\ = ki 
and |k2| = /c2 are the radii of two rings in Fourier space. 



3. Equation (|T^ with x,y,z = flki , ^k; ? '2k2 is employed to calculate the estimator 
Xiii(ki, k'^, k2) = Xiu{ki, k2), averaging over rotations. The estimator is then nor- 
malized to its standard deviation calculated under the assumption of Gaussianity, 
i.e. divided by the square root of equation (|T^). The resulting estimator is 

Xiuih,k2) 



var [xiii (A;i,/c2) 
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P(ki)py^(k2) 
Here A'" {ki, k2) is the number of available samples. 



(20) 



4. If the underlying random field is Gaussian, then the estimators Xiii(fci, ^2) for all avail- 
able {ki,k2) should be independent Gaussian variables with mean zero and variance 
one. Therefore, we further calculate the first few cumulants of Xm the domain 
(fci, ^2)- These cumulants can be expressed as 



mean 



var 



Xiii 
Xiii 



El-. 
£2= 

£5= 



Xlll(/i:i,/C2)) , 

^iii(A;i,A;2) -£i 
Xni{h,k2) -ii 



Xin(ki,k2) -Si\ j- 3£2, 

Xlll(^l,^2) -£ll ) - 10£2£3 



(21) 
(22) 
(23) 
(24) 
(25) 



Here the ensemble averages are taken over all available pairs of {ki,k2)- We shall 
denote the number of samples in the domain {ki, k-i) as N^. 

5. If the joint distribution of {oki, o^, Cka} is Gaussian, then £„ should have mean 0, 
except £2 having mean 1. They also have variances 



r 1 

var[£n] = T^- 



(26) 



Therefore, if the estimators in of the map depart from their means much further than 
var[£„], we can reject the hypothesis that the map is Gaussian. We normalize the 
quantities £„ to their variances: 



'^i = V^x ^2^ (£2 - 1) , 



UJr,. = 



ni 



in, n = 3,4,5. 



(27) 
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With this procedure, we evaluate the estimators a>„ from CMB maps under investigation. 
For a Gaussian map and sufficiently large N^, the estimators cUn are independent Gaussian 
variables of mean zero and variance one. If some of the observed cUn depart significantly 
from zero, we reject the hypothesis that the map is Gaussian. 

B. The Gaussian background and instrumental noise 

Even if the main underlying mechanism for producing the CMB anisotropics contributes 
a non-Gaussian compoment, the observed CMB may be close to Gaussian due to the central 
limit theorem. We model this by adding a Gaussian background to our simulated non- 
Gaussian map. This Gaussian component is also expected in scenarios where both infiation 
and defects are present. For simplificy, we assume instrumental noise to be Gaussian as well. 
As a first step for testing the method, we compute the estimators cD^ for these two Gaussian 
components and their combination and check numerically that the normalized cumulant 
estimators a)„ lie within the range (—1,1) for these cases. 




FIG. 2. Sample (5°)^ map of Gaussian background. 
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FIG. 3. Plot of the estimators cDn of the (5°)^ Gaussian map mixed with pixel noise as functions 
of the noise fraction /j^oi- '^^^ shaded area is the one-sigma interval — 1 < c<)„, < 1. 

First, the Gaussian background is generated with the power spectrum computed by 
CMBFAST with sample parameters = 0.05, ^^CDM = O-^^' ^ = O-^- ^e show 
one such map in Fig. |^. The instrumental noise is simulated as a Gaussian field with a 
white-noise power spectrum. These two maps are mixed, with the noise being a fraction 



noi 



'noi 



2 I 2 ' 

G noi 



(28) 



where ctq and o"j^Qi denote the RMS amplitudes of the Gaussian background and the white 
noise respectively. We then compute the estimators as defined in Eq. (^), for the value 
of Jyioi going from zero to unity. The results are plotted in Fig. |^. Each curve for an 
estimator Un is a mean of ten independent realizations with error bars corresponding to 
the numerically obtained standard deviation of that estimator. The same applies to later 
figures of this type. For the range < f-^Q\ < 1, we see that all uOn lie within the one-sigma 
region (—1, 1) and the error bars are well confined within the two-sigma region. This verifies 
the reliability of our numerical procedure, as well as the theory about the use of iu)„. For 
reference, we plot the power spectra of the Gaussian background and the instrumental noise 
in Fig. ^, as the solid and dotted lines respectively. 
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FIG. 4. Power spectra of various components presented in Figs. ^, |5|, ^, and 10. Their pixel 
standard deviations are all made equal to 1 for comparison purposes. 

We generated 10,000 realizations of Gaussian background and computed u)n for each 
realization. This allowed us to check the probability distributions of a;„ numerically, and 
we find that they are indeed Gaussian distributed with variances all equal to unity for n = 
1, ...,5. We have also numerically verified that this result is independent of the underlying 
power spectrum of the Gaussian background, as it should be on theoretical grounds. One of 
the advantages of employing a;„ for testing Gaussianity is that their theoretical distributions 
are known, so that one does not need to implement Monte Carlo simulations for the likelihood 
analysis, i.e. to compute the "equivalent Gaussian realizations" from the map being tested. 
In the likelihood analyses below we shall simply shade the one-sigma area — 1 < a>„ < 1 in 
all relevant plots. 



C. Point sources 



One important challenge for all CMB data analysis is to deal with the presence of point 
sources. This usually unwanted component obscures the cosmological non-Gaussian signal 
in most CMB non-Gaussian tests. For this reason it is important to see how this component 
contributes to our estimators u)„. 

We first generate five point sources, each having the power spectrum 
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Ci oc exp [- {rRkf\ , (29) 

where R is the angular size of the field. Thus r indicates the size of a point as a fraction 
of the size of the field, and here we use r = 0.01. A sample map is shown in Fig. ^. The 
map of point sources is superimposed onto the same Gaussian background as in the previous 
subsection, with a fraction of the point sources 

2 

/pnt = ^T^f^' (30) 
+ ^pnt 

where o"p^^ is the RMS amplitude of the point signal. We then compute the estimators 
uJnif pnt) ^ previous case. The results are plotted in Fig. 



FIG. 5. A sample map of random point sources. 




FIG. 6. Estimators a)„ of the map of random point sources mixed with the Gaussian background, 
as functions of the fraction of power in the point sources /pnt- The shaded area is — 1 < ci>„ < 1. 
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As one can see, when the fraction of the point signal is weak enough (/p^t ~ 0.02), its 
non-Gaussianity does not show up. Once this signal becomes stronger (/pnt ~ 0.02), ui 
starts departing from the one-sigma region while the other a;„ follow at larger /p^t- With 
/pnt ~ 0.1, one can use u)i to reject the hypothesis of Gaussianity at a confidence level of 
more than 99%, since the solid line with crosses goes outside the three-sigma range (—3, 3). 

We plotted the power spectrum of the point sources as the dot-dashed line in Fig. |. By 
comparing this line to the solid line (the power spectrum of the Gaussian background), we 
find that for /p^t ~ 0.02, the point signal never comes to dominate on any scale and so 
unsurprisingly passes the Gaussianity test using u)„. Once fp^^ ^ 0.02, the power spectrum 
of the point signal starts to dominate at / ~ 2000, at which stage it fails the test. Thus, 
we conclude that our estimators Un are sensitive to point sources. For this method to 
succeed in real CMB data, therefore, it will be necessary to first remove the point sources 
using any of the available methods (see e.g. [Q), and then compute the estimators a)„. In 
the following analyses, we shall ignore the point sources by assuming that they have been 
removed beforehand. 



D. Test on filled rectangles 

To test the sensitivity of our method to certain types of non-Gaussianity, we first try 
a simple model using filled rectangles. In a (5°)^ field with a resolution of 256^ pixels, we 
generate five filled rectangles, each having a size of 32 x 48 grid spacings. These rectangles 
are shown in Fig. 0. This map is then mixed with the Gaussian background used in Fig. |^, 
with the fraction of rectangles /rect defined similarly to the fraction of point sources /pnt- 
We then computed the estimators ujnifrect) before, and the results are plotted in Fig. ||. 
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FIG. 7. A sample map of filled rectangles (left) and the same map mixed with a Gaussian 
background with a fraction of power in the rectangles /j-ect ~ 0.0005 (right). 




FIG. 8. Plot of the estimators tD„ of the mixed map of rectangles with Gaussian background as 
a function of /rect- The shaded area is —1 < a;„ < 1. 

We find that one can use (^4 to reject the hypothesis of Gaussianity at more than a 
95% confidence level when /j^gct ~ 0.001. This means that non-Gaussian features of the 
rectangles show up once their RMS amplitude is larger than a few percent of the total 
amplitude. Referring to the power spectrum of the rectangles in Fig. ^ (the dashed line), 
we see that once /^ect ~ 0.001, the power of the rectangles never comes to dominate when 
compared with the power of the Gaussian background (the solid line). This explains why 
non-Gaussian features of the rectangles are not visible in ujn when /rect ~ 0.001. 

We also test the sensitivity of our method to the rectangles in the presence of noise. In 
this case we mix three components: the rectangles, the Gaussian background, and the white 



noise. The last two components are exactly the same as the two we used in Section [IV B 
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The noise fraction is fixed as /j,^; = 0.05^, so the rectangles have a fraction 



cr. 



rect ~ ^2 



rect 



rect + + ^noi 



1 - 0.052 _ 



(31) 



The LUn are computed as a function of /rect' results are presented in Fig. 




FIG. 9. Same as Fig. 



0.1 0.2 0.3 0.4 0.5,,, 0.6 0.7 0.8 0.9 1 

f , ( f ^.^=0.05) 
rect ' noi ' 



but mixed with noise with the fraction of power /j^qJ = 0.05^. 



This time we see that the non-Gaussian feature of the rectangles does not show up until 
a higher /^ect- -^^^ example, one can use 0)4 to reject the hypothesis of Gaussianity at 
more then 95% confidence level only when f-^Qd ~ 0.02. Referring to the power spectra 
of the rectangles, the Gaussian background, and the noise in Fig. ^, one finds that when 
/rect ~ 0.02 with f-^^^ = 0.05^, the power of the rectangles is dominated by the power of 
the Gaussian components (the Gaussian background and the noise); but when /j-ect ~ 0.02, 
the power in rectangles starts to dominate at / ~ 4000. This verifies again that for a>„ to 
succeed in detecting a non-Gaussian signal, the power of the the non-Gaussian component 
needs to dominate within at least a certain range of the accessible /. Thus we see that noise 
may reduce the sensitivity of our method for detecting non-Gaussian signals, depending on 
whether the noise dominates the non-Gaussian signal on scales where it originally dominated. 
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E. Detecting cosmic strings 



Finally, we test our method against the string-induced CMB map. We use a toy model 
of Ref. to simulate the string-induced integrated Sachs- Wolfe (ISW) effect. This model 
incorporates most main features of cosmic strings, such as the scaling and self-avoiding 
properties, as well as their wiggliness. One realization of the resulting CMB is shown in 
Fig. TH. The dynamic range is 40 in conformal time starting from last scattering, and the 
angular size is (1°)^ with a resolution of 256^. A Gaussian background is simulated as before. 
These two maps are then linearly summed, with a string fraction f^^j. defined as above. The 



estimators a>„ are calculated as functions of /g^^., and the results are shown in Fig. |Tl 




FIG. 10. Sample (1°)^ maps of the string-induced ISW effect (left) and the same with added 
Gaussian background (right) where the fraction of power in the string component is fg^j- = 0.0001. 



D 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 



FIG. 11. Estimators ii„ of the map of strings mixed with a Gaussian background as functions 
of /g^j.- "^^^ shaded area is — 1 < a)„ < 1. 
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From the sohd hne in the figure (0)2), one can reject the hypothesis of Gaussianity at 
more than a 95% confidence level when f^^j. ^ 0.001. The power spectrum of the Gaussian 
background is shown as the solid line in Fig. ^ while that of the string-induced ISW effect 
is shown with the thick dashed line in the same plot. Again, if we compare these two lines, 
we find that when f^^j. < 0.001, the string-induced perturbations are dominated by the 
Gaussian background on all scales; when /g^j^ > 0.001, the string-induced perturbations 
start to dominate at / ~ 10^. This result is consistent with our previous argument about 
the sensitivity of the estimators uJn- 

In the presence of instrumental noise, this sensitivity will be reduced if the noise domi- 
nates the string-induced perturbations on scales where they originally dominated. In other 
words, the presence of the noise will contribute extra power to the Gaussian component of 
the underlying map, so as to raise the threshold in power beyond which the non-Gaussian 
signal may dominate. This argument is again verified in Fig. |12|, where we mix three compo- 
nents: the string-induced ISW effect, the Gaussian background, and the white noise whose 
strength is 5% of the total RMS amplitude. This gives 



/str 



str 



^str + ^G + ^noi 



1 - 0.052 _ 



(32) 



As one can see, the hypothesis of Gaussianity is rejected at more than a 95% confidence level 
when /g^^ > 0.05, using uj2. This threshold /g^^ ~ 0.05 can be again verified by comparing 
the thick dashed, solid, and dotted lines in Fig. 11. 




0.1 0.2 0.3 0.4 0.5,,, 0.6 0.7 0.8 0.9 1 
f , (f '^^=0.05) 

Str ' noi ' 
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FIG. 12. Same as Fig. 11 but for maps mixed with noise, with a noise fraction = 0.05^. 



By comparing results for strings and rectangles (see section piV D| ), we find that they have 



similar thresholds for rejecting the hypothesis of Gaussianity. This is because strings and 
rectangles have almost identical power spectra and they dominate the Gaussian component 
above similar thresholds and at similar scales. 

By comparing Figs. |ll| and |l^, with Figs. ^, |^ and ^ we find that only string-induced 
perturbations result in a significantly negative value of the estimator Cj^-, while all other 
non-Gaussian components we tried produce only positive a;„ outside the 95% confidence 
region (the two-sigma region). This provides a potential discriminator in distinguishing 
string-induced features from other non-Gaussian signals. 



V. CONCLUSION 

In this paper, we studied the properties of normalized inter-scale cumulants of Fourier 
modes for homogeneous random fields and proposed a new set of statistics to test a small- 
scale CMB map for Gaussianity. We showed that higher-order cumulant estimators of 
Fourier components of Gaussian fields are themselves nearly Gaussian distributed variables 
with zero expectation value and known variances. Therefore, a test of Gaussianity of these 
cumulant estimators computed from a given map constitutes a test of Gaussianity of the 
map. This method is quite general and can be employed on cumulants of any order. However, 
since the usefulness of higher-order cumulants quickly drops with the order, the bispectrum 
components and the 4-th order correlators are the most promising candidates. 

As an application, the statistical estimators that we denoted by Cjn were constructed 
from all available bispectrum components. We prepared simulated maps made up of ran- 
dom point sources, random rectangles, and simulated cosmic string networks, superimposed 
on Gaussian background and pixel noise. By developing an analytic model of fields con- 
taining random superimposed shapes, we showed that inter-scale cumulants of all orders are 
generically present in such fields, and that noise and Gaussian background limits the sensi- 
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tivity of the cumulant estimators on scales where its power spectrum is significant compared 
to the power spectrum of the non-Gaussian component. 

We numerically verified that the discriminators u)„ are sensitive to non- Gaussian signal of 
simulated maps, as well as to point sources. To apply these statistics to detect certain types 
of non-Gaussian signals such as those resulting from defects, one needs therefore to remove 
point sources on scales where they dominate the non-Gaussian components one would like 
to detect. The instrumental noise was also shown to be capable of reducing the sensitivity 
of our new method, due to its extra contribution to the Gaussian component on small 
scales. Nevertheless, as we theoretically showed and numerically verified, our new statistics 
are capable of detecting non-Gaussian component as long as its power spectrum dominates 
within some range of accessible scales. We also found that the string- induced ISW effect, 
unlike other non-Gaussian models we tested, such as point sources, induces a significantly 
negative value of Cj2- This is a potential discriminator of string-induced perturbations. We 
intend to apply this method to characterize the non-Gaussian features resulting from more 
realistic defect models. Even if defects are unimportant as the underlying mechanism for 
structure formation, we could still apply these methods to detect their existence, especially 
when high-accuracy and high-resolution observations become available in the near future. 
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APPENDIX A: DISTRIBUTION OF CUMULANT ESTIMATORS FOR 
INDEPENDENT GAUSSIAN SAMPLES 

To test whether a random variable x is Gaussian distributed, one can estimate higher- 
order cumulants Xn, n > 2 of that variable and check that they vanish within their statistical 
variance. Given a number of realizations of x, one can compute moment estimators fin and 
cumulant estimators Xn for that purpose. However, obtaining the likelihood that the data 
Xi satisfies the hypothesis requires knowledge of the distribution of the cumulant estimators 
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Xn for an underlying hypothetical Gaussian ensemble of realizations of x. If the cumulant 
estimators Xn were themselves Gaussian distributed, one would only need to know their 
ensemble mean values (xn) and covariances {XmXn)- 

In this Appendix we give a self-contained derivation for the covariances of the cumulant 
estimators Xm computed for a set of independent samples of one (Gaussian) random 
variable x with mean /j, and standard deviation a. We show below that to leading order in 
N"'^, the cumulant estimators are biased by 

(X2n) = X2n - ^"'^^''^ + ^ (A^'^ , (Al) 

(X2nH-i> =X2n+i + 0(Ar-2), (A2) 

and are statistically independent, 

iXmXn) - iXm) (Xn) = AT" V^^nlf^^^ + O (aT'^) . (A3) 

Here Xi = 1^, X2 = and x„ = for n > 3. We shall also show that the higher-order corre- 
lations between Xn such as {xiXmXn) are absent up to terms of order O {N~^) and therefore 
for large A^ these estimators can be treated as approximately Gaussian distributed, inde- 
pendent random variables with known means and variances. This is immediately relevant 
to inter-scale cumulants of Fourier space amplitudes, and by using our formalism we show 
that these are also independent. 

Consider a set of A" independent realizations Xi, i = 1,...,N of the Gaussian random 
variable x. This set satisfies 

(xi) = /X, {xiXj) ^ + a'^Sij. (A4) 

The problem at hand is to characterize the estimators for the cumulants Xn of the distribution 
of X. First we take the unbiased estimators /t„ of the moments of x, which are 

/i„^Ar-i^<. (A5) 

i 

The cumulant estimators Xn may be defined through these moment estimators by the usual 
formulae relating cumulants and moments, e.g. for the second and third cumulants 
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X2= fL2- Al, 



(A6) 



X3 = /is - 3/i2/ii + 2/ii, (A7) 



and so on (these standard relations follow from Eqs. (|A12|) -( [AT^ ) below). Defined in this 



manner, the cumulant estimators Xn are however not unbiased, because for instance fi\ in 
Eq. ( |A6|) is a biased estimator for the square of the first moment 



fi) = [N-'E^i^j) =f^l + N-'^'. (A8) 

\ ij I 

We shall show now that the resulting bias in x„ is always of order N~'^ or smaller. 

We need to consider this bias in more detail since the expectation values of Xn are zero 
for n > 2 and also because terms of order O (N^^) dominate the variance of the estimators 
Xn- Terms of order O {N~^) arise in expressions such as Eq. (|A6|) from products of moment 
estimators. By analogy with Eq. ( |A8| ) one obtains 

(AaAfe) = AiayWb + {fJ'a+b - f^af^b) ■ (A9) 

Products of more than two moment estimators will also contain terms of higher order than 
but here we are only interested in the leading terms. For products of three moments, 
we obtain 

iflaflbP'c) = fJ'afJ'bfJ'c + ifJ'a+bfJ'c + f^b+cfJ'a + fJ'c+afJ'b " SfiafJ'bfJ'c) + O (iV^^) , (AlO) 

and it is straightforward to generalize to products of k estimators, 

{fla^...flak) = f^a,--fia, + ^ [f^a,- ■■ f^a,+ay -f^a, " f^a.-f^a^] + O (A^^^) . (All) 

l<i<j<k 

The sum above is performed over all pairs of moments entering the original expression. 

Now we need to find out where and how such terms appear in expressions for cumulant 
estimators. The relation of cumulants Xn to moments yU„ is most easily understood by means 
of the generating function of moments (also called the characteristic function) Z [p] which 
satisfies 
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Because of the normalization Z[0]~^ the generating function Z [p] is defined up to an ir- 
relevant constant factor. For instance, the Gaussian distribution from Eq. ( |A4| ) has the 
generating function 



Z [p] = exp 



2 • 

-—a -tfip 



(A13) 



The cumulants can be expressed (or, equivalently, defined) as 

^-=[%) l^^t^U (A14) 

(note that the constant factor in the definition of Z [p] drops out). The n-th derivative of 
In Z [p] in Eq. ( |A14|) will contain terms such as 

1 d'Zjp] 1 d^Z[p] 
Z[p] dp^ "'Z[p] dp"^ ^ ' 

which give rise to products of moments We would like to replace the moments fin 

by their estimators /i„ and then to combine them pairwise to obtain the O {N~^) terms as 

shown in Eq. (|A11|) . 



We notice that if we take an expression such as (l + N~^^'^fiij ... (l + N^'^^'^fikj and 
expand it in A^~^/^, then the term of order O {N~^) will be equal to the sum over all 
pairs of the product fiifij. This is almost what is required by Eq. ( [AllD , except that 



we also need to somehow convert fiifij into yUj+j. This would happen if /i„ behaved like 
a differential operator (d/dt)^ acting on functions of a dummy variable t. The required 
technical trick is implemented as follows. We first replace the moments /i„ in Eq. (|A14 ) 



by the differential operator (^yU„ + [id/dt]^^. This is done by changing the generating 

function of moments Z [p] to 

^ \V\ t\ ^ E [y^n + ^''^'^'^^ j = ^ b] + exp [M] . (A16) 

Then we introduce another copy of the old generating function of moments, Z [t] , on which 
this new generating function Z \p\ t] will act with dt at the end of the calculation, so that 
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the derivatives [id-t\^ will at the end be replaced again by the moments We now claim 
that the expectation value of cumulant estimators Xn and of their products can be obtained 
up to and including terms of order O {N~^) from the modified generating function Z [p; t] as 

{Xn) = [id,T In Z b; t] Z [t]^^,^, + O (A17) 

and 

{x^Xn) = {[^^,>r\n Z [p'; t]) {[zd,r In Z [p; t]) Z [t],=p,=,=o + O {n^'/') . (A18) 

Here the parameters p, p' and t are set to only after all derivatives have been computed. 
Expectation values of products of three or more cumulant estimators Xn are be obtained in 
the same manner, 

(Xn,...XnJ = {[^^,,r ^^Z\pr, t]) ... {[^^,,r In Z [p,; t]) Z [t]^^^,^^ + O {N''/') . (A19) 



We shall first show that Eqs. ( |A17| )-( |AT8D actually give the desired O {N ^) terms and 



then proceed to compute these terms. 

Consider Eq. ( |A17| ): after evaluating all derivatives in p we would obtain many terms 



of the form of Eq. ( [A15| ) but with the modified generating function Z [p; t] acting (with the 
derivatives in t) on Z [t] , 

/ 1 d^^Z[p;t] 1 d^"Z[p;t]\ 



\Z\p-t] dp^^ Z[p;t] dp^" 
Take one such term, substitute p = and expand in A^~^/^: 

/ 1 d^'Z[p;t] 1 d^''Z[p;t]\ 



Z [4^0 . (A20) 

p=0 



Z[p;t] "'Z\p;t] dp^- 



p=0 



z [4=0 



k=l 
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V i<i<j<n J 

Note that after taking the derivatives in t the O (1) and O (A^~^) terms in Eq. ([A21j ) are 
exactly the ones in Eq. (lATl , while the O (^N-^^^) terms cancel. Therefore each term of 
the form of Eq. (|A20|) yields the desired combination of moments for Eq. ( [A17| ). Note that 
although the modified generating function Z [p; t] gives the correct result in the O {N~^) 
terms, its higher-order expansion terms are not useful. 

The same argument can be shown to hold for Eq. ( [A18| ) and generally for analogous 
expressions for averages of products of several Xn- We only need to take all derivatives in 
the parameters p, p', ... prior to taking the derivative in t in those expansions. 

Our strategy to obtain the expectation values from Eqs. (|A17| )-( |A18|) will be to first 
expand in N~^^'^ up to 0{N^^), then substitute p = in the correct order, and then 
evaluate the derivatives in t. Since we know that the O (1) terms give the unbiased result 
and that terms of order O (^N^-^^'^^ vanish (both of these statements are straightforwardly 
checked by a similar calculation), while the O (N^^^'^j and higher-order terms are not useful 



for us, we only concentrate on the O {N ^) terms. Expanding the logarithm in Eq. ( |A16|) , 
mz b; t) = in Z W + A'-"^^ - ^'-^ + O (N-"^) . (A22) 
we obtain the expression 

(A23) 



which will give the answer to Eq. (|A17|) if we take the derivative in p and substitute p = t = 0. 
The result is 

At this point we need to use a particular generating function Z [p] . For a Gaussian 
variable with the generating function given by Eq. (|A13|) the expectation value becomes 



iXn) =Xn-^ W,^, e-'^' + O . (A25) 



2 

32 



This is the same as Eqs. 

Eq. ( |A24[ ) expresses the cumulant estimator bias for an arbitrary distribution through 
its generating function Z \p\ . Ahhough it can be seen that the bias for a non-Gaussian 
distribution will be different from that of Eq. ( [A25| ) , we do not need its general expression 
since we are only interested in testing the hypothesis of an underlying Gaussian distribution. 

The calculation of the covariances of cumulants (Eq. (|A18|) ) is a little more involved since 
we need to expand both logarithmic terms prior to taking the derivatives in t. We begin 
with one logarithmic term [cf. Eq. (|A23|) ], use the Gaussian generating function Z [t] and 
evaluate the ra-th derivative in p while keeping the t variable intact, 

[i^p]" [Z [t] In Z [p] + [t] exp (-a V) - W exp (-a V - 2(tV ^ 



= Z [t] Xn + iV-^/^Z [t] {-%r a'^r - ^Z [t] [zd,];^^ exp (-a V - 2a'pt) . (A26) 

(We shall not need the full expansion of the last cumbersome derivative.) Then we act on 
this expression with another logarithmic term (from Eq. ( |A22|) ), set t = and take the m-th 
derivative in p. After some straightforward algebra we obtain Eq. (|A3|) : 

iXmXn) - (Xm) (Xn) = N^' [^^,];^, ("O" (T ^ = N'W^S^^ + O {n-') . (A27) 

The same method and Eq. ( |A24|) can be used to compute the expectation values and 
covariances of cumulant estimators also for non-Gaussian distributions as long as the gen- 
erating function Z [p] is given. Expectation values of products of two or more cumulants 
(e.g. {xiXmXn)) can be found as well, although the calculations are cumbersome. However, 
in the case of the underlying Gaussian distribution a simple algebraic consideration shows 
that higher-order correlations between the cumulant estimators Xn are of order O [N""^) or 
smaller, and therefore Xn themselves can be approximately treated as independent Gaussian 
variables. This is found by noticing that the logarithmic derivative operator of Eq. ( |A22[ ) 
contains terms of order O (^N^-^^'^^ and O (N^^) while we are only interested in the O (A^~^) 
terms in the final result. We could introduce a formal operator L„ by 

Ln = [tdX In Z [p, t^^, = xn + iV-'/'A„ + N-'B^ + O {n-'/^) . (A28) 
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The operator coefficients An and i?„ will later act on Z [t] with their derivatives in t evaluated 
only at the end; note that AnZ [t]^^Q = 0. Then Eq. (|A18|) for the correlation between the 
cumulant estimators is rewritten as 

{XmXn) = Ln^LnZ [t],^^ + O [N-''/^) . (A29) 

Selecting the O (N^^) terms in the product gives 

LmLnZ - {LmZ {LnZ [t],^^) = A^A^Z [4^0 + ... (A30) 

where all terms containing i?„ cancel. The only surviving O {N~^) term came from the 
product of two O (^N^^^'^^ non-commuting operator terms, and all other terms contained a 
commuting Xn and canceled after subtracting the product of the expectation values. Now we 
notice that the O (N^-^) terms in a product of more than two operators L„ from Eq. ( |A28| ) 
must contain at least one factor Xn- It follows that all "connected correlators" of more than 
two cumulant estimators, such as 

iXlXmXn) - iXl) iXmXn) " iXra) {XlXn) - {Xn) {XlXm) + 2 {xi) {Xrn) {Xn) (A31) 

(for any /, m, n) contain no O {N~^) terms. Therefore the cumulant estimators Xn in the 
limit of large number of samples N are approximately Gaussian distributed and independent 
variables (up to terms of order O (iV~^)). 

APPENDIX B: PROPERTIES OF MULTIVARIATE CUMULANTS 

Here we examine the estimators Xmn of cumulants of a distribution of two Gaussian 
variables (x, y) which are independent and have variances a1 and cr^. We show that the 
cumulant estimators are approximately independent in the limit of large sample size A^, 
namely that their covariances are 

{XklXn^n) - iXkl) (Xmn) = N-'mW. SkmSln^T^^ (jf + O (A^-^) , (Bl) 

while the means are biased by a quantity of order A^~^, 
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iXki) = O {n-') . (B2) 



Then we show how to generahze the results of the previous Appendix to multivariate cumu 
lants. 

Cumulants of a distribution of two variables (x, y) are defined by analogy with Eq. ([A 141) 



Xmn = [idp]"^ [idgf In Z [p, g]p=g=o ' (B3) 



where Z [p, q] is the generating function of moments of the distribution, 



^^^^ ^ (x-^y") = [zd,r W Z [p, g],=,=o • (B4) 



}p=q=0 ■ 

A Gaussian distribution is characterized by 

1 



Z [p, q] = exp 



2 iP,<l)Cip,qy 



(B5) 



where we multiplied the row and column 2- vectors {p, q) by the appropriate 2x2 correlation 
matrix C. (This matrix is diagonal if the variables (x, are independent, but we shall not 
need this for most of the derivation.) 

We follow the same method of calculation as in Appendix A and introduce a modified 
(operator-valued) generating function 

Z [p, q- t,u] = Z [p, q] + exp [pdt + qd^] (B6) 



to be used instead of Z [p,t] in Eqs. ( |A17| ), ( |A18| ). We omit the calculations since they are 



very similar to those in Appendix A and only cite the results. The bias in the cumulant 
estimators Xmn is described by a formula analogous to Eq. ( |A25| ) , 

iXmn) = Xmn — [^^p]" [«<9g]" CXp - {p, q) C (p, g)^ _ _ . (B7) 



p=q=0 



The covariance of two cumulant estimators Xfc/ and Xmn vanishes unless the cumulants 
are of the same order, k + I = m + m which case it is given by 

{XkiXmn) - (Xki) (Xmn) = [idtf [iduf [iOp]"" [idqf exp [- {t, u) C {p, qf]^^^^^^^^^ ■ (B8) 
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In our case of interest, the covariance matrix C is diagonal, 

C 

and then Eqs. (P^)-(P^) simphfy to 



(B9) 



N- 



2 2 2 2 

-p a^-q ay 



p=q=0 



+ O , (BIO) 



(XklXmn) - (Xkl) (Xmn) = N'' mMdkmSlnCjT + O (N^^) . 



(Bll) 



The expressions of Eqs. (piCI| )- ([BTl] ) are straightforwardly generalized for distributions 
of three and more variables, for example 



var 



(B12) 



For the case of cumulants of Fourier components of a Gaussian random field, the in- 
dividual Fourier modes are independently distributed with variances equal to the power 
spectrum P (k). Therefore the expressions we derived are applicable directly with the sub- 
stitution = P (ki), cXy = P {k2),... for the appropriate modes. A general Fourier cumulant 
C''-"^ (ki, ...,k„) of Eq. (P) with all n vectors kj distinct will be estimated from samples 
obtained by rotating the set of vectors {kj} and is in our notation a cumulant of the 

n- variable distribution of the n Fourier modes a\f_.; its sample variance is therefore 



var 



(ki, k„)l = N-'P {k,) ...P (A;„) + 0(n 



r-2 



(B13) 



APPENDIX C: GENERATING FUNCTIONALS FOR RANDOM 
SUPERPOSITIONS OF SHAPES 

Here we derive the generating functional for the random field resulting from a super- 
position of fixed profiles (shapes), centered at points ("seeds") drawn from some known 
distribution and scaled and rotated randomly. In particular, we show how to use the gen- 
erating functional to obtain the cumulants of Fourier components for such a random field. 
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We first consider the simpler case of tlie Poisson distribution of seeds and then briefly show 
how to generalize the same formalism for non-Poisson distributions of seed centers. 

We work in flat space for simplicity; although we only need the result in two dimensions, 
our derivation applies to any dimensional space. We consider a finite region of space with 
unit area, so that J dx = 1. If we start with a shape of a given profile s (x) and translate it 
to a set of locations xi, x„, the result is 

f {x;xi,...,Xn) = J2^(^ ~ ^k) ■ (CI) 

k 

To compute the generating functional for the distribution / {x; xi, ...,Xn) we need to average 
the following over all numbers n of centers and center positions x^: 

Z [J (x)] = ^exp (^—i J f (x; Xi, x„) J (x) dx-^ ^ . (C2) 

The centers Xi are Poisson distributed with a fixed mean number of centers tIc in the whole 
region, and the probability of having n centers is (nc)" exp (— nc) /n\. The averaging in 
Eq. Q with / (x) defined by Eq. (|CTD then gives 



Z [J (x)] =e """^^—j / dxi...dxnexp (—i / f {x; xi, ...,Xn) J (x) dx 

As usual, the logarithm lnZ[J(x)] of the generating functional generates the cumulants 
C*^") (a^i, Xn) of the distribution, 

lnZ[J(x)]= fc^^)(x,)^^dx,+ [ C^^) [x,,X2)^-^4^!j^dx^dx2 + ... (C4) 

(this can also be taken as a definition of cumulants) and therefore is of most interest for us. 

We now add more variety to the random field / (x; xi, x„) by allowing the shapes s (x) 
to be rotated, scaled and attenuated. The rotation is effected by introducing an orthogonal 
matrix R uniformly distributed in the orthogonal group of (i-dimensional rotations O (rf); 
the scaling by a factor A distributed according to some probability density px (A); and the 
attenuation by multiplying the profile by an overall factor v with probability density py (z/). 
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We give each shape an individual randomly selected rotation angle, scale and attenuation. 
The transformed profile is us {X^^R {x — Xq)) and the generating functional is similar to the 
one above: 

In Z [J (x)] = —Tie ~^ j dxodRpx (A) d\ {v) dv 

X exp (^iu J s (^\~^R {x — xq)^ J (x) dx^ . (C5) 

Here the integration measure dR for rotations is assumed to be normalized to unity, as well 
as with all other integrations. We shall below drop the irrelevant additive constant in 
Eq. (01). 

The non-Gaussian components of the distribution are now read from Eq. (|C5| ). After the 
exponential is expanded in powers of J (x) under the integral, the general cumulant of order 
n is formed from the terms of order J (x)" and can be readily found for any given profile 
s (x) and any assumed distributions of scaling and attenuation. The resulting non-Gaussian 
distribution is homogeneous and isotropic due to averaging over spatial position xq and the 
rotations R. 

To illustrate the method, we extract from Eq. ( |G5| ) the characteristic function / (j) of 
the one-point distribution of / (x; Xi, x„) at a fixed reference point x = x*. This is done 
by substituting J (x) = j6 {x — x^) into Eq. (|C5|) ; the result is the generating function of 
cumulants for the one-point distribution. 

In / (j) = Yl -^Xn = ric (A) / dxp^ {u) dv exp {-ivs (x) j) (C6) 
and the n-th cumulant Xn of that distribution is 

Xn = n,(A)K)(."(^)L- (C7) 

In the same manner, one can obtain the generating functional for the distribution of the 
Fourier components of / (x). By Fourier transforming Eq. (p5|) we obtain 



\YiZ[]{k)] =n,J dxdRpx{X)dXp^{u)diy exp (^-iv j S {-XRk) j (k) 6"'''''-^^^^ . (C8) 
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The n-th Fourier space cumulant C**^"^ (ki, kn) is then given by 

/n 
dRpx (A) dX 5{ki + ... + kn) n ^ {-^Rh) (C9) 

i=l 

Note that since s{k) is a Fourier transform of a real profile function s (x) and the rotations 
R include mirror symmetry k —k, the cumulant of Eq. ( |C9| ) is always real-valued. 

Although Eq. ( |C5| ) shows that cumulants of all orders are generally expected to be 
nonzero, we need to estimate the statistical significance of their deviation from zero com- 
pared to the sample variance. The variance of a cumulant estimator, as derived above, is 
proportional to the appropriate powers of the power spectrum. The power spectrum P {k) 
of the seed-induced distribution can be found from Eq. (|C^ ) as the second-order cumulant 
(7(2) (k,-k), 

P (k) = n, J dRpx (A) dX \s {-XRk)f . (CIO) 

To understand the qualitative behavior of the cumulant estimators, we shall simplify the 
case by assuming that the distribution of scales px (A) is trivial and that the profile s (x) is 
spherically symmetric. In that case, the integrals in Eqs. ( |C9| ) , ( |(J1CI| ) simplify 

C^") {k^, kn) = (z/") ~s (k,) ...5 (A;„) , (Cll) 

P{k) =n,(^u^)\S{k)\\ (C12) 

If the cumulant C^") (ki,...,kn) is estimated using a sample of values, the ratio of the 
expected signal to the standard deviation of the estimator for a Gaussian sample (assuming 
all ki are different) is 

C"''>(h....,k„) _ s/N (v") S(k,)...s{k„) 



The last ratio of various s {k) in Eq. ( |C13| ) is equal to 1 if the homogeneity constraint 



ki + ... + kn = is satisfied; also the ratio containing u should be of order 1 except for 
specially engineered attenuation distributions. We obtain therefore the general result that 
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the sensitivity of an individual cumulant estimator is decreased with the cumulant order 
n and with the expected number ric of seeds in the observation region (the latter is a 
manifestation of the central limit theorem), and grows with the number of sample points 
which is determined by the number of modes Ok at the chosen scales as \/N . We stress 
that Eq. ( |C13| ) provides only an estimate of the sensitivity under simplifying assumptions 
of spherical symmetry and fixed scale of seed profiles. 

Finally, we generalize our formalism to include arbitrary (non-Poisson) distributions of 
seed centers. Similarly to a continuous random field, a distribution of seed centers can be 
fully specified by a suitable generating functional. Constructing it is equivalent to specifying 
all the connected correlation functions (a;i, a;„) of seed positions. Assume that we are 
given all joint probability densities Proh (xi, x„) Ha; dx^ for having a seed at each of the 
positions Xi,...,x„. If we denote the one-point seed density Proh{xi) = Ug^xi), then the 
two-point seed correlation function ^ (xi,X2) is usually defined by 

Prob{xi, X2) = Us (xi) Tis (X2) [1 + ^ {xi,X2)] . (C14) 

The three-point function ^ (xi, 0:2, xs) is then defined from the relation 

Proh (xi, X2, X3) = ris (xi) Us (X2) (xs) [l + i (xi, X2) 

+i (Xi, X3) + ^ (X2, X3) + ^ (Xi, X2, X3)] (C15) 

and similarly for the higher-order correlations. The relation between the joint probability 
densities Prob{xi, ...,x„) and the connected correlation functions ^ (xi, ...,x„) is similar to 
that of moments and cumulants of a continuous random field (except that ^'s in our case 
are multiplied by several factors of n^). We can define the generating functional for the seed 
distribution Zg [J (x)] as the functional that generates the joint probabilities, 

Zs [J (x)] = 1 + E / dx,...dXr,'^-^^^^:f^-^Prob (xi, x„) . (C16) 



One can see from Eqs. ( |C14| ), ( |C15| ) that the connected correlation functions ^ (xi, X2, ...) 



are generated by the logarithm of Zs, similarly to the usual cumulants except for the nor- 
malization to the one-point density Ug: 
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C (-^l; •••5 •^n) 



5" 1 z 

5J{Xi)...5J{Xn) ^ 



Us [x] 



J{x) 



(C17) 



J=0 



Conversely, if we are given the full set of connected correlators ^ (xi, ...,x„) , > 2, we use 
Eq. ( |C17| ) to construct the generating functional for seeds [J] (we would need to formally 
introduce the one-point correlators as ^ (xi) = 1 to use Eq. ( |C17| ) with n = 1 and also fix 
Z,[J = 0] = 1). 

A simple example is a Poisson distribution where the seeds are completely uncorrelated, 
^ (xi, ...,Xn) = for n > 2 and the distribution is completely specified by the seed density 
Hg (x). We obtain a generating functional 



Z^'^ [J (x)] = exp —i J Ug (x) J (x) dx 



(C18) 



It turns out that given any generating functional of seeds Zg [J] , one can directly obtain 
the generating functional Z [J] for the "seeded" random field of Eq. ( |CT| ) . To show this, 
we compare the definitions of both functionals. The definition of Z [J] in Eq. (|C^ ) is the 
average of the quantity 



exp (^-i J f (x; xi, x„) J (x) dx^ = 11-^ i^k] J) 



E (x; J) = exp 



i J (x') s (x' — x) dx' 



(C19) 



(C20) 



over all seed numbers n and seed positions Xk- The average is weighed by the probabilities 
P (xi, x„) rifc dxk of having seeds at points xi,...,x„ and nowhere else. These probabilities 
are related to the probability densities Prob (xi, x„) by 



P (xi, x„) = P (0) Prob (xi, x„) 



(C21) 



where P (0) = exp [— / Ug (x) dx] is the (finite) probability of having no seeds anywhere. 
Then Eq. ( |(J2D can be rewritten as 



+ 



Z [J (x)] = P (0) 1 + J dxiProb (xi) E (xi; J) 
dxidx2 



2! 



-Prob (xi, X2) E (xi; J) E (x2; J) + ... 



(C22) 
(C23) 
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We notice that Eqs. ( |C16|) and (|C23| ) are essentially the same, and therefore 



Z[J{x)] = P{^)Zs[iE (x; J)] 



(C24) 



Eq. ( |C24| ) is the main relation which allowed us above to obtain the analytic generating 
functional for the Poisson distribution of seeds, cf. Eqs. (P5D, ( |C18| ) with Ug (x) = const. 
The generating functional Z [j {k)] for the Fourier modes of the random field is obtained in 
a similar way. 



Z[j(fc)] = P(0)Z, iE{x;3) , E(x;j(k))=exp 



i / J (k) ~s (k) e-*'^"rfk 



(C25) 



These simple relations are quite general and allow to compute the generating functional 
Z [i {k)] for arbitrary distribution of seeds given by a generating functional Zg [J (x)]. 
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